PREFACE


In this Memorandum the author employs the mathematical
technique of dynamic programming to obtain a best-fit
approximation to a function that is defined over some
given interval. He then describes how this method offers
an approach to the handling of a certain type of
pattern-recognition problem and to the approximation of
optimal control policies.
SUMMARY

The problem we start with appears to be quite
specialized. Given a function u(t) defined over the
interval [0,a], we wish to find a polygonal approximation
which is a best fit in a mean-square sense. The
analytic problem for N is that of minimizing the
function

        $$R_N = \sum_{i=0}^{N-1} \int_{t_i}^{t_{i+1}} (u(t) - a_i - b_i t)^2 \, dt$$

over the quantities $a_i$, $b_i$, and $t_i$. Here $t_0 = 0$,
$t_N = a$.

This can be treated in a number of direct fashions,
using search and gradient techniques. We wish, however,
to employ dynamic programming, which appears to be
superior even in this case, and then gradually to
enlarge the scope of the problem until it covers a
question in the identification of systems and a version
of the general problem of considering suboptimal policies
in control processes.
DYNAMIC PROGRAMMING, SYSTEM IDENTIFICATION,
AND SUBOPTIMIZATION


1. INTRODUCTION

The problem we start with appears to be quite
specialized. Given a function u(t) defined over the
interval [0,a], we wish to find a polygonal approximation
which is a best fit in a mean-square sense. (See
Fig. 1.) The analytic problem for N is that of
minimizing the function

(1.1)   $$R_N = \sum_{i=0}^{N-1} \int_{t_i}^{t_{i+1}} (u(t) - a_i - b_i t)^2 \, dt$$

over the quantities $a_i$, $b_i$, and $t_i$. Here $t_0 = 0$,
$t_N = a$.

This can be treated in a number of direct fashions,
using search and gradient techniques. We wish, however,
to employ dynamic programming, which appears to be
superior even in this case, and then gradually to
enlarge the scope of the problem until it covers a
question in the identification of systems and a version
of the general problem of considering suboptimal policies
in control processes. Results related to what follows
have been presented in [1,2,3].
2. ADAPTIVE CURVE FITTING

The foregoing problem can be considered to fall
within the new area of sequential computation. In place
of choosing the $t_i$ in advance, we allow the structure
of the function u(t) to determine their positions.
Similar techniques can be applied in connection with the
numerical integration of ordinary and partial
differential equations. Write

(2.1)   $$\min_{(a_i, b_i, t_i)} R_N = f_N(a),$$

defined for N = 0, 1, 2, ..., and $a \ge 0$. Introduce the
function of two variables,

(2.2)   $$A(s_1, s_2) = \min_{a,b} \int_{s_1}^{s_2} (u(t) - a - bt)^2 \, dt,$$

for $0 \le s_1 \le s_2 < \infty$. That this happens in this case
to be explicitly calculable is of no particular
significance at the moment. In general, this function
will be obtained via numerical methods.
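
As an illustration of why (2.2) is explicitly calculable in the
straight-line case, the minimization over a and b reduces to two
linear normal equations. The moment symbols L, m_k, U_k, and W
below are introduced only for this sketch and do not appear in the
original notation.

```latex
% Moments over [s_1, s_2]; these symbols are introduced for this sketch only.
\[
L = s_2 - s_1, \quad
m_k = \int_{s_1}^{s_2} t^k \, dt \ (k = 1, 2), \quad
U_k = \int_{s_1}^{s_2} t^k u(t) \, dt \ (k = 0, 1), \quad
W = \int_{s_1}^{s_2} u(t)^2 \, dt .
\]
% Setting the partial derivatives of the integral in (2.2)
% with respect to a and b equal to zero gives the normal equations:
\[
a L + b m_1 = U_0, \qquad a m_1 + b m_2 = U_1 .
\]
% At the minimizing (a, b) the residual is orthogonal to 1 and t, hence
\[
A(s_1, s_2) = W - a U_0 - b U_1 \qquad (s_1 < s_2).
\]
```

For $s_1 = s_2$ the interval is empty and $A(s_1, s_2) = 0$.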

Then

(2.3)   $$f_0(a) = A(0, a),$$

and the principle of optimality yields the recurrence
relation

(2.4)   $$f_N(a) = \min_{0 \le t_N \le a} \left[ A(t_N, a) + f_{N-1}(t_N) \right],$$

for $N \ge 1$.

This leads to a quite simple and efficient
computational algorithm.
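
The recurrence (2.3)-(2.4) can be sketched on a grid as follows.
This is a minimal illustration under stated assumptions, not the
author's program: u(t) is sampled on a uniform grid, switch points
are restricted to grid points, and the integral in (2.2) is replaced
by a Riemann sum. The function names are hypothetical. Following
(2.3), N counts switch points, so f_N uses N + 1 straight-line pieces.

```python
def line_fit_cost(ts, us):
    """Sum of squared residuals of the least-squares line through the samples."""
    n = len(ts)
    if n <= 2:                       # one or two points are fitted exactly
        return 0.0
    t_mean, u_mean = sum(ts) / n, sum(us) / n
    stt = sum((x - t_mean) ** 2 for x in ts)
    stu = sum((x - t_mean) * (y - u_mean) for x, y in zip(ts, us))
    b = stu / stt
    a = u_mean - b * t_mean
    return sum((y - a - b * x) ** 2 for x, y in zip(ts, us))


def best_polygonal_fit(t, u, n_switch):
    """Grid version of (2.3)-(2.4); returns (f_N(a), list of switch points)."""
    n, h = len(t), t[1] - t[0]       # assumes a uniform grid with spacing h
    # cost[i][j] stands in for A(t_i, t_j), approximated by a Riemann sum
    cost = [[h * line_fit_cost(t[i:j + 1], u[i:j + 1]) if j >= i else 0.0
             for j in range(n)] for i in range(n)]
    f = [cost[0][:]]                 # (2.3): f_0(a) = A(0, a)
    arg = [[0] * n]
    for N in range(1, n_switch + 1):
        f_row, arg_row = [], []
        for j in range(n):           # right endpoint a = t[j]
            # (2.4): minimize A(t_N, a) + f_{N-1}(t_N) over t_N = t[k] <= a
            k_best = min(range(j + 1), key=lambda k: cost[k][j] + f[N - 1][k])
            arg_row.append(k_best)
            f_row.append(cost[k_best][j] + f[N - 1][k_best])
        f.append(f_row)
        arg.append(arg_row)
    # Trace the minimizing switch points back from the right end.
    switches, j = [], n - 1
    for N in range(n_switch, 0, -1):
        j = arg[N][j]
        switches.append(t[j])
    return f[n_switch][n - 1], switches[::-1]


# Usage: a V-shaped function with a single corner at t = 1.
t = [0.05 * k for k in range(41)]
u = [abs(x - 1.0) for x in t]
residual, switches = best_polygonal_fit(t, u, 1)
print(residual, switches)
```

The segment cost is the only place where the straight-line,
mean-square assumption enters; the recurrence itself uses the
table of costs and nothing else.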


3. DISCUSSION

Perhaps the first point to note in connection with
what has been given above is that the computational
feasibility of the algorithm inherent in (2.4) is not
strongly dependent upon the mean-square norm in (2.2).
We could just as easily use

(3.1)   $$A(s_1, s_2) = \min_{a,b} \max_{s_1 \le t \le s_2} |u(t) - a - bt|,$$

or allow approximation by polynomials of higher degree.
This brings us into contact with the theory of spline
approximations, but we shall not pursue that here; see
[4] for an extensive set of references.

As soon as we start pursuing the idea of approximating
to u(t) over the interval $[s_1, s_2]$ by a
function of simple analytic form, we enter the domain
of differential approximation [5]. We recognize that a
polynomial of degree M satisfies the differential
equation

(3.2)   $$\frac{d^{M+1} v}{dt^{M+1}} = 0,$$

that the exponential polynomial $\sum_{k=1}^{M} a_k e^{\lambda_k t}$ satisfies
the differential equation

(3.3)   $$\frac{d^{M} v}{dt^{M}} + b_1 \frac{d^{M-1} v}{dt^{M-1}} + \cdots + b_M v = 0,$$

and that $\sum_{k=1}^{M} a_k \cos \lambda_k t$ satisfies a similar
equation of degree 2M. It follows that a substantial
extension of straight-line approximation is the
following. Determine the parameters $a_i$ and initial
conditions so that

(3.4)   $$\| u - v \|$$

is minimized, where u is given and v is determined
by the ordinary differential equation

(3.5)   $$\frac{d^{M} v}{dt^{M}} = g\left(v, \frac{dv}{dt}, \ldots, \frac{d^{M-1} v}{dt^{M-1}}; a_1, a_2, \ldots\right),$$

$v^{(i)}(0) = c_i$, i = 0, 1, ..., M - 1. Here, we can use a
mean-square norm, or some other convenient norm.

Problems of this nature can be attacked by means of
quasilinearization and other techniques [5].
4. IDENTIFICATION OF SYSTEMS

The foregoing remarks and techniques allow us to
approach an interesting problem in the identification of
systems. Suppose that we know that a function u(t) is
generated in the following manner. In the interval
$t_i \le t \le t_{i+1}$, where $0 = t_0 < t_1 < \cdots < t_{N+1} = a_0$,
it satisfies the equation

(4.1)   $$\frac{d^{M} v}{dt^{M}} = g\left(t, v, \ldots, \frac{d^{M-1} v}{dt^{M-1}}, a_i\right), \qquad v^{(j)}(t_i) = c_{ij}, \quad j = 0, 1, \ldots, M - 1.$$

Given the values of u(t) in [0,a], we wish to
determine the vector parameters $a_i$, the parameters $c_{ij}$,
and the switching points $t_i$, and occasionally N
itself. This is a particular type of pattern recognition
problem.

We begin by introducing the function

(4.2)   $$A(s_1, s_2) = \min_{a, c} \int_{s_1}^{s_2} (u - v)^2 \, dt,$$

where v(t) satisfies (4.1), $0 \le s_1 \le s_2 \le a$. Our
assumption is that we can compute this function of two
variables. This will in general, however, be a
nontrivial task. If then we introduce the function

(4.3)   $$f_N(a) = \min_{\{a_i, c_{ij}, t_i\}} \int_{0}^{a} (u - v)^2 \, dt,$$

$a \ge 0$, allowing N switch points, or transition points,
we obtain exactly the same recurrence relation as in
(2.4). If u(t) is actually determined by (4.1), we
will have $f_N(a_0) = 0$ for the correct choice of $t_N$.
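
The identification procedure can be illustrated on a toy case. The
sketch below makes several assumptions that are not in the text:
the law in each interval is taken to be the first-order linear
equation dv/dt = alpha*v with v(t_i) = c_i (so M = 1), alpha is
searched over a finite candidate list, u(t) is sampled on a uniform
grid, and the integral in (4.2) is replaced by a Riemann sum. All
function names are hypothetical. N is found by increasing it until
the residual f_N(a_0) vanishes to within a tolerance.

```python
import math


def segment_cost(ts, us, alphas):
    """Stand-in for A(s1, s2) in (4.2), assuming dv/dt = alpha*v, v(s1) = c."""
    best = float("inf")
    for alpha in alphas:
        # For fixed alpha, v = c * exp(alpha * (t - s1)) is linear in c,
        # so the best c follows from one-parameter least squares.
        basis = [math.exp(alpha * (x - ts[0])) for x in ts]
        c = sum(y * e for y, e in zip(us, basis)) / sum(e * e for e in basis)
        best = min(best, sum((y - c * e) ** 2 for y, e in zip(us, basis)))
    return best


def switching_fit(t, u, alphas, n_switch):
    """Recurrence (2.4) with the cost above; returns (f_N(a_0), switch points)."""
    n, h = len(t), t[1] - t[0]       # assumes a uniform grid with spacing h
    cost = [[h * segment_cost(t[i:j + 1], u[i:j + 1], alphas) if j >= i else 0.0
             for j in range(n)] for i in range(n)]
    f, arg = [cost[0][:]], [[0] * n]
    for N in range(1, n_switch + 1):
        arg_row = [min(range(j + 1), key=lambda k, j=j: cost[k][j] + f[N - 1][k])
                   for j in range(n)]
        f.append([cost[k][j] + f[N - 1][k] for j, k in enumerate(arg_row)])
        arg.append(arg_row)
    switches, j = [], n - 1
    for N in range(n_switch, 0, -1):
        j = arg[N][j]
        switches.append(t[j])
    return f[n_switch][n - 1], switches[::-1]


# Synthetic data: alpha = 1 on [0, 1], then alpha = -2 on [1, 2].
t = [0.05 * k for k in range(41)]
u = [math.exp(x) if x <= 1.0 else math.e * math.exp(-2.0 * (x - 1.0)) for x in t]
alphas = [0.5 * k for k in range(-6, 7)]   # candidate list containing 1 and -2

n_found = None
for N in range(3):                          # try N = 0, 1, 2 in turn
    residual, found_switches = switching_fit(t, u, alphas, N)
    if residual < 1e-10:                    # f_N(a_0) = 0 up to rounding
        n_found = N
        break
print(n_found, found_switches)
```

With noisy data the residual would not vanish, and a threshold or
a penalty on N would be needed; that choice is outside the scope of
this sketch.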


5. SUBOPTIMIZATION

For analytic, economic, and engineering convenience,
it is often useful to consider the approximation of
optimal control policies by simple, feasible control
policies.

Thus, for example, in the minimization of

(5.1)   $$J(u) = \int_{0}^{T} g(u, u') \, dt, \qquad u(0) = c,$$

we may wish to consider as admissible functions only
those for which

(5.2)   $$u'(t) = b_i, \qquad s_i \le t \le s_{i+1},$$

with $s_0 = 0$, $s_{N+1} = T$, where the $b_i$ and $s_i$ are to
be chosen.

Let us define

(5.3)   $$f_N(T, c) = \min J(u),$$

where the minimum is now over the class of suboptimal
policies defined above. Then, as before, the principle
of optimality yields the relation

(5.4)   $$f_N(T, c) = \min_{b_0, s_1} \left[ \int_{0}^{s_1} g(u(b_0, t), b_0) \, dt + f_{N-1}(T - s_1, u(b_0, s_1)) \right],$$

for $N \ge 1$, with

        $$f_0(T, c) = \min_{b_0} \int_{0}^{T} g(u(b_0, t), b_0) \, dt.$$

Here $u(b_0, t)$ denotes the function over the relevant
t-interval determined by the nature of the suboptimal
policy and the initial state c. In this case,
$u(b_0, t) = c + b_0 t$.
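
A direct, unoptimized transcription of (5.4) may help fix ideas.
The sketch assumes the particular integrand g(u, u') = u^2 + u'^2,
which is not specified in the text, and restricts the slope b_0 to
a finite list and s_1 to a uniform grid on [0, T]. The function
names and grid sizes are hypothetical.

```python
def segment_integral(c, b, s):
    """Integral over [0, s] of g(u, u') = u^2 + u'^2 along u(t) = c + b*t.

    The integrand g is an assumption of this sketch; the closed form is
    c^2 s + c b s^2 + b^2 s^3 / 3 (from u^2) plus b^2 s (from u'^2).
    """
    return c * c * s + c * b * s * s + b * b * s ** 3 / 3.0 + b * b * s


def f(N, T, c, slopes, n_steps):
    """f_N(T, c) of (5.3), computed by the recurrence (5.4) on a grid."""
    if N == 0:
        # f_0(T, c): a single slope b_0 over the whole interval [0, T]
        return min(segment_integral(c, b, T) for b in slopes)
    best = float("inf")
    for k in range(n_steps + 1):
        s1 = T * k / n_steps                 # candidate first switch time
        for b in slopes:
            # cost of the first piece plus the optimal cost of the remainder,
            # started from the state u(b_0, s_1) = c + b_0 * s_1
            total = (segment_integral(c, b, s1)
                     + f(N - 1, T - s1, c + b * s1, slopes, n_steps))
            best = min(best, total)
    return best


# Usage with T = 1, c = 1, slopes from -2 to 2 in steps of 1/8.
slopes = [-2.0 + 0.125 * k for k in range(33)]
f0 = f(0, 1.0, 1.0, slopes, 10)
f1 = f(1, 1.0, 1.0, slopes, 10)
print(f0, f1)
```

Plain recursion is used here for clarity, so the work grows
geometrically with N. Tabulating $f_{N-1}$ on a grid of (T, c) values
and interpolating would be the usual remedy. Because the grid of
$s_1$ contains T itself, the N-switch class includes every policy
with fewer switches, and so `f1` cannot exceed `f0`.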


6. REDUCTION OF DIMENSIONALITY

One of the purposes of using suboptimal policies is
to bypass some of the analytic and computational
difficulties of the original optimization problem. This
is particularly the case when we have a control process
involving either a high-dimensional state vector, or an
infinite-dimensional vector.

In this situation, we can often replace the actual
state vector at time t by a record of the control
policies used, and thus obtain a more manageable
computational algorithm. Furthermore, we can use new
types of approximation methods. For a detailed
discussion of this technique, see [6].